
AI Workload


Beyond Connectivity: An Open Architecture for AI-RAN Convergence in 6G

Polese, Michele, Mohamadi, Niloofar, D'Oro, Salvatore, Bonati, Leonardo, Melodia, Tommaso

arXiv.org Artificial Intelligence

Abstract--Data-intensive Artificial Intelligence (AI) applications at the network edge demand a fundamental shift in Radio Access Network (RAN) design, from merely consuming AI for network optimization to actively enabling distributed AI workloads. This presents a significant opportunity for network operators to monetize AI while leveraging existing infrastructure. To realize this vision, this article presents a novel converged O-RAN and AI-RAN architecture for unified orchestration and management of telecommunications and AI workloads on shared infrastructure. The proposed architecture extends the Open RAN principles of modularity, disaggregation, and cloud-nativeness to support heterogeneous AI deployments. We introduce two key architectural innovations: (i) the AI-RAN Orchestrator, which extends the O-RAN Service Management and Orchestration (SMO) framework to enable integrated resource allocation across RAN and AI workloads; and (ii) AI-RAN sites that provide distributed edge AI platforms with real-time processing capabilities. The proposed architecture enables flexible orchestration, meeting the requirements for managing heterogeneous workloads at different time scales while maintaining open, standardized interfaces and multi-vendor interoperability.

This paper has been submitted to IEEE for publication. M. Polese, L. Bonati, and T. Melodia are with the Institute for the Wireless Internet of Things, Northeastern University, Boston, MA, USA. This article is based upon work partially supported by the NTIA PWSCIF under Award No. 25-60-IF054, the U.S. NSF under award CNS-2112471, and by OUSD(R&E) through Army Research Laboratory Cooperative Agreement Number W911NF-24-2-0065.
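The joint allocation problem such an orchestrator faces can be sketched with a toy priority policy (a hypothetical illustration, not the paper's algorithm; the job names and capacities below are invented):

```python
def allocate(gpu_total, ran_demand, ai_jobs):
    # Hypothetical policy: RAN processing gets strict priority because it
    # runs under real-time deadlines; AI workloads then share whatever
    # GPU capacity remains, largest request first.
    ran_grant = min(ran_demand, gpu_total)
    free = gpu_total - ran_grant
    ai_grants = {}
    for name, want in sorted(ai_jobs.items(), key=lambda kv: -kv[1]):
        ai_grants[name] = min(want, free)
        free -= ai_grants[name]
    return ran_grant, ai_grants

# 10 GPUs at an edge site: RAN needs 4, AI jobs ask for 8 in total.
ran, ai = allocate(10, 4, {"training": 5, "inference": 3})
```

A real orchestrator would run such a loop at multiple time scales, per the article: fast cycles for RAN-side control and slower ones for AI workload placement.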


Tensor Program Optimization for the RISC-V Vector Extension Using Probabilistic Programs

Peccia, Federico Nicolas, Haxel, Frederik, Bringmann, Oliver

arXiv.org Artificial Intelligence

RISC-V provides a flexible and scalable platform for applications ranging from embedded devices to high-performance computing clusters. In particular, its RISC-V Vector Extension (RVV) is of interest for accelerating AI workloads. But writing software that efficiently utilizes the vector units of RISC-V CPUs without expert knowledge requires the programmer to rely on the autovectorization features of compilers or on hand-crafted libraries like muRISCV-NN. Smarter approaches, like autotuning frameworks, have so far lacked integration with the RISC-V RVV extension, heavily limiting the efficient deployment of complex AI workloads. In this paper, we present a workflow based on the TVM compiler to efficiently map AI workloads onto RISC-V vector units. Instead of relying on hand-crafted libraries, we integrated the RVV extension into TVM's MetaSchedule framework, a probabilistic program framework for tensor operation tuning. We implemented different RISC-V SoCs on an FPGA and tuned a wide range of AI workloads on them. We found that our proposal shows a mean improvement of 46% in execution latency when compared against the autovectorization feature of GCC, and 29% against muRISCV-NN. Moreover, the binary resulting from our proposal has a smaller code memory footprint, making it more suitable for embedded devices. Finally, we also evaluated our solution on a commercially available RISC-V SoC implementing the RVV 1.0 Vector Extension and found that it produces mappings that are 35% faster on average than the ones proposed by LLVM. We open-sourced our proposal for the community to expand it to target other RISC-V extensions.
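The essence of MetaSchedule-style tuning, sampling candidate schedules from a design space and keeping the best measured one, can be shown without any TVM dependency using a toy cost model (everything below is illustrative; real tuning measures latency on the actual RVV hardware):

```python
import random

def simulated_latency(n, vlen, tile):
    # Toy stand-in for a hardware measurement: wider vectors amortize
    # per-element work, each tile pays fixed loop overhead, and tile
    # sizes that don't divide n pay a scalar-epilogue penalty.
    full_tiles, rem = divmod(n, tile)
    return full_tiles * (tile / vlen + 2.0) + rem * 1.0

def tune(n, trials=200, seed=0):
    # Random search over (vector length, tile size) candidates.
    rng = random.Random(seed)
    space = {"vlen": [1, 2, 4, 8, 16], "tile": [8, 16, 32, 64, 128]}
    best = None
    for _ in range(trials):
        cand = {k: rng.choice(v) for k, v in space.items()}
        lat = simulated_latency(n, cand["vlen"], cand["tile"])
        if best is None or lat < best[0]:
            best = (lat, cand)
    return best

best_lat, best_cfg = tune(1024)
scalar_lat = simulated_latency(1024, vlen=1, tile=8)  # unvectorized baseline
```

MetaSchedule replaces this blind random search with learned cost models and evolutionary search, but the structure of the loop is the same.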


Load Balancing for AI Training Workloads

McClure, Sarah, Ratnasamy, Sylvia, Shenker, Scott

arXiv.org Artificial Intelligence

We investigate the performance of various load balancing algorithms for large-scale AI training workloads that are running on dedicated infrastructure. The performance of load balancing depends on both the congestion control and loss recovery algorithms, so our evaluation also sheds light on the appropriate choices for those designs as well.
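The trade-off under study can be seen in a minimal simulation (illustrative, not from the paper): per-flow hashing keeps packets in order but lets colliding elephant flows overload one link, while per-packet spraying balances near-perfectly at the cost of reordering, which is exactly why the choice interacts with congestion control and loss recovery:

```python
def ecmp_loads(flows, n_links=4):
    # Per-flow hashing: every packet of a flow takes the same link.
    loads = [0] * n_links
    for fid, pkts in flows:
        loads[fid % n_links] += pkts  # toy hash: flow id mod link count
    return loads

def spray_loads(flows, n_links=4):
    # Per-packet spraying: packets round-robin across all links.
    loads = [0] * n_links
    nxt = 0
    for _, pkts in flows:
        for _ in range(pkts):
            loads[nxt] += 1
            nxt = (nxt + 1) % n_links
    return loads

# Two elephant flows whose hashes collide, plus three small mice flows.
flows = [(0, 100), (4, 100), (1, 1), (2, 1), (3, 1)]
```

Here ECMP piles 200 packets onto one link while spraying spreads the same 203 packets almost evenly.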


Turning AI Data Centers into Grid-Interactive Assets: Results from a Field Demonstration in Phoenix, Arizona

Colangelo, Philip, Coskun, Ayse K., Megrue, Jack, Roberts, Ciaran, Sengupta, Shayan, Sivaram, Varun, Tiao, Ethan, Vijaykar, Aroon, Williams, Chris, Wilson, Daniel C., MacFarland, Zack, Dreiling, Daniel, Morey, Nathan, Ratnayake, Anuja, Vairamohan, Baskar

arXiv.org Artificial Intelligence

Artificial intelligence (AI) is fueling exponential electricity demand growth, threatening grid reliability, raising prices for communities paying for new energy infrastructure, and stunting AI innovation as data centers wait for interconnection to constrained grids. This paper presents the first field demonstration, in collaboration with major corporate partners, of a software-only approach--Emerald Conductor--that transforms AI data centers into flexible grid resources that can efficiently and immediately harness existing power systems without massive infrastructure buildout. Conducted at a 256-GPU cluster running representative AI workloads within a commercial, hyperscale cloud data center in Phoenix, Arizona, the trial achieved a 25% reduction in cluster power usage for three hours during peak grid events while maintaining AI quality of service (QoS) guarantees. By orchestrating AI workloads based on real-time grid signals without hardware modifications or energy storage, this platform reimagines data centers as grid-interactive assets that enhance grid reliability, advance affordability, and accelerate AI's development.
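A minimal sketch of the kind of policy such a platform needs, curtailing flexible training jobs toward a floor while leaving QoS-protected inference untouched (hypothetical job names and numbers, not Emerald Conductor's actual algorithm):

```python
def shed_power(jobs, cap_watts):
    # jobs: list of dicts with "name", "watts", "floor", "protected".
    # During a grid event, cut unprotected jobs toward their floor until
    # total draw fits under the cap; protected jobs keep full power.
    alloc = {j["name"]: j["watts"] for j in jobs}
    excess = sum(alloc.values()) - cap_watts
    for j in jobs:
        if excess <= 0:
            break
        if j["protected"]:
            continue
        cut = min(excess, j["watts"] - j["floor"])
        alloc[j["name"]] -= cut
        excess -= cut
    return alloc

cluster = [
    {"name": "pretrain", "watts": 200, "floor": 100, "protected": False},
    {"name": "finetune", "watts": 120, "floor": 60, "protected": False},
    {"name": "inference", "watts": 80, "floor": 80, "protected": True},
]
alloc = shed_power(cluster, cap_watts=300)  # a 25% cut, as in the trial
```

The interesting engineering is in everything this sketch omits: reacting to real-time grid signals, checkpointing throttled jobs, and verifying that QoS guarantees actually hold.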


Efficient Unified Caching for Accelerating Heterogeneous AI Workloads

Wang, Tianze, Liu, Yifei, Chen, Chen, Zuo, Pengfei, Zhang, Jiawei, Weng, Qizhen, Chen, Yin, Han, Zhenhua, Zhao, Jieru, Chen, Quan, Guo, Minyi

arXiv.org Artificial Intelligence

Modern AI clusters, which host diverse workloads like data pre-processing, training and inference, often store the large-volume data in cloud storage and employ caching frameworks to facilitate remote data access. To avoid code-intrusion complexity and minimize cache space wastage, it is desirable to maintain a unified cache shared by all the workloads. However, existing cache management strategies, designed for specific workloads, struggle to handle the heterogeneous AI workloads in a cluster -- which usually exhibit heterogeneous access patterns and item storage granularities. In this paper, we propose IGTCache, a unified, high-efficacy cache for modern AI clusters. IGTCache leverages a hierarchical access abstraction, AccessStreamTree, to organize the recent data accesses in a tree structure, facilitating access pattern detection at various granularities. Using this abstraction, IGTCache applies hypothesis testing to categorize data access patterns as sequential, random, or skewed. Based on these detected access patterns and granularities, IGTCache tailors optimal cache management strategies including prefetching, eviction, and space allocation accordingly. Experimental results show that IGTCache increases the cache hit ratio by 55.6% over state-of-the-art caching frameworks, reducing the overall job completion time by 52.2%.
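The pattern-detection step can be illustrated with a much-simplified classifier (the thresholds and statistics below are invented stand-ins for IGTCache's actual hypothesis tests):

```python
from collections import Counter

def classify(accesses, seq_thresh=0.8, skew_thresh=0.5):
    # Sequential: most consecutive accesses advance by one item.
    # Skewed: a single item dominates the stream. Otherwise: random.
    if len(accesses) < 2:
        return "random"
    strides = [b - a for a, b in zip(accesses, accesses[1:])]
    if sum(s == 1 for s in strides) / len(strides) >= seq_thresh:
        return "sequential"
    top_count = Counter(accesses).most_common(1)[0][1]
    if top_count / len(accesses) >= skew_thresh:
        return "skewed"
    return "random"
```

Each verdict then maps to a policy: prefetch ahead of sequential streams, pin hot items for skewed ones, and avoid wasting space on random scans.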


POLARON: Precision-aware On-device Learning and Adaptive Runtime-cONfigurable AI acceleration

Lokhande, Mukul, Vishvakarma, Santosh Kumar

arXiv.org Artificial Intelligence

The increasing complexity of AI models requires flexible hardware capable of supporting diverse precision formats, particularly for energy-constrained edge platforms. This work presents PARV-CE, a SIMD-enabled, multi-precision MAC engine that performs efficient multiply-accumulate operations using a unified datapath for 4/8/16-bit fixed-point, floating-point, and posit formats. The architecture incorporates a layer-adaptive precision strategy to align computational accuracy with workload sensitivity, optimizing both performance and energy usage. The results demonstrate up to a 2x improvement in PDP and a 3x reduction in resource usage compared to SoTA designs, while retaining accuracy within 1.8% of the FP32 baseline. The architecture supports both on-device training and inference across a range of workloads, including DNNs, RNNs, RL, and Transformer models. The empirical analysis establishes POLARON, which incorporates PARV-CE, as a scalable and energy-efficient solution for precision-adaptive AI acceleration at the edge.
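The fixed-point side of such a datapath can be modeled in a few lines (a behavioral sketch only; PARV-CE is hardware, and the bit widths below are just two of its supported formats):

```python
def quantize(x, bits, frac_bits):
    # Signed, saturating fixed-point quantization: round to the nearest
    # representable value, then clamp to the n-bit two's-complement range.
    q = round(x * (1 << frac_bits))
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, q))

def mac(a, b, bits=8, frac_bits=4):
    # Multiply-accumulate the quantized operands in a wide integer
    # accumulator, then rescale the result back to a real number.
    acc = 0
    for x, y in zip(a, b):
        acc += quantize(x, bits, frac_bits) * quantize(y, bits, frac_bits)
    return acc / (1 << (2 * frac_bits))
```

Lowering `bits` trades accuracy for energy, which is what a layer-adaptive precision strategy exploits: precision-sensitive layers keep wide formats while tolerant ones run narrow.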


ADA: Automated Moving Target Defense for AI Workloads via Ephemeral Infrastructure-Native Rotation in Kubernetes

Sheriff, Akram, Huang, Ken, Nemeth, Zsolt, Nakhjiri, Madjid

arXiv.org Artificial Intelligence

This paper introduces the Adaptive Defense Agent (ADA), an innovative Automated Moving Target Defense (AMTD) system designed to fundamentally enhance the security posture of AI workloads. ADA operates by continuously and automatically rotating these workloads at the infrastructure level, leveraging the inherent ephemerality of Kubernetes pods. This constant managed churn systematically invalidates attacker assumptions and disrupts potential kill chains by regularly destroying and respawning AI service instances. This methodology, applying principles of chaos engineering as a continuous, proactive defense, offers a paradigm shift from traditional static defenses that rely on complex and expensive confidential or trusted computing solutions to secure the underlying compute platforms. At the same time, it agnostically supports the latest advancements in agentic and non-agentic AI ecosystems and solutions, such as agent-to-agent (A2A) communication frameworks or model context protocols (MCP). This AI-native infrastructure design, relying on the widely proliferated cloud-native Kubernetes technologies, facilitates easier deployment, simplifies maintenance through an inherent zero trust posture achieved by rotation, and promotes faster adoption. We posit that ADA's novel approach to AMTD provides a more robust, agile, and operationally efficient zero-trust model for AI services, achieving security through proactive environmental manipulation rather than reactive patching.
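The rotation loop itself is simple enough to model directly (a pure-Python simulation with an invented pod schema; in a real cluster the respawn would be Kubernetes deleting a pod and its Deployment's ReplicaSet recreating it):

```python
import itertools

_fresh_ids = itertools.count(1)

def rotate(pods, ttl_s, now_s):
    # Moving-target step: any pod older than the TTL is destroyed and
    # replaced by a fresh instance, so no service endpoint stays static
    # long enough for an attacker's assumptions about it to hold.
    out = []
    for pod in pods:
        if now_s - pod["started"] >= ttl_s:
            out.append({"id": next(_fresh_ids), "started": now_s})
        else:
            out.append(pod)
    return out

pods = [{"id": "old-pod", "started": 0}, {"id": "young-pod", "started": 50}]
pods = rotate(pods, ttl_s=60, now_s=70)
```

After a rotation pass, every pod's age is strictly below the TTL, bounding how long any compromised instance can persist.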


Arm's 2025 CPU plans include a big push in PC performance

PCWorld

You would think that Arm, which arguably has been the vanguard of the smartphone and PC industry's push for improved power efficiency, would double down on that strategy in its plans for 2025. PCWorld sat down at CES 2025 with Chris Bergey, senior vice president and general manager for Arm's client line of business. Bergey is responsible for both the smartphone as well as the laptop and tablet business, where Arm's designs are licensed by companies like Qualcomm and Apple, who tweak and eventually manufacture them as finished goods. Arm provides multiple types of licenses, but the two most common are a core license, where a customer buys a verified core that includes an Arm Cortex CPU, Mali GPU, or other intellectual property, and an architectural license, sold to companies like Apple, which gives them the freedom to design their own cores from scratch, though the result must be fully compatible with the Arm architecture.


Good things come in small packages: Should we adopt Lite-GPUs in AI infrastructure?

Canakci, Burcu, Liu, Junyi, Wu, Xingbo, Cheriere, Nathanaël, Costa, Paolo, Legtchenko, Sergey, Narayanan, Dushyanth, Rowstron, Ant

arXiv.org Artificial Intelligence

To match the booming demand for generative AI workloads, GPU designers have so far been trying to pack more and more compute and memory into single complex and expensive packages. However, there is growing uncertainty about the scalability of individual GPUs and thus AI clusters, as state-of-the-art GPUs are already displaying packaging, yield, and cooling limitations. We propose to rethink the design and scaling of AI clusters through efficiently connected large clusters of Lite-GPUs, GPUs with single, small dies and a fraction of the capabilities of larger GPUs. We think recent advances in co-packaged optics can be key in overcoming the communication challenges of distributing AI workloads onto more Lite-GPUs. In this paper, we present the key benefits of Lite-GPUs on manufacturing cost, blast radius, yield, and power efficiency; and discuss systems opportunities and challenges around resource, workload, memory, and network management.
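The yield argument follows directly from the classic Poisson die-yield model, Y = exp(-A * D0): smaller dies fail less often, so the same wafer area produces more good silicon. The defect density and die areas below are illustrative, not from the paper:

```python
import math

def poisson_yield(die_area_cm2, defect_density):
    # Poisson yield model: probability a die has zero killer defects.
    return math.exp(-die_area_cm2 * defect_density)

D0 = 0.2                       # defects per cm^2 (illustrative)
big = poisson_yield(6.0, D0)   # one large, reticle-sized die
lite = poisson_yield(1.5, D0)  # a Lite-GPU die a quarter of the area

# Expected good silicon from the same 6 cm^2 of wafer area:
good_big = 6.0 * big
good_lite = 4 * 1.5 * lite
```

Under these toy numbers the four small dies yield noticeably more usable area than the one large die, which is the manufacturing-cost case for Lite-GPUs.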


AI-RAN: Transforming RAN with AI-driven Computing Infrastructure

Kundu, Lopamudra, Lin, Xingqin, Gadiyar, Rajesh, Lacasse, Jean-Francois, Chowdhury, Shuvo

arXiv.org Artificial Intelligence

The radio access network (RAN) landscape is undergoing a transformative shift from traditional, communication-centric infrastructures towards converged compute-communication platforms. This article introduces AI-RAN, which integrates both RAN and artificial intelligence (AI) workloads on the same infrastructure. By doing so, AI-RAN not only meets the performance demands of future networks but also improves asset utilization. We begin by examining how RANs have evolved beyond mobile broadband towards AI-RAN and articulating the manifestations of AI-RAN in three forms: AI-for-RAN, AI-on-RAN, and AI-and-RAN. Next, we identify the key requirements and enablers for the convergence of communication and computing in AI-RAN. We then provide a reference architecture for advancing AI-RAN from concept to practice. To illustrate the practical potential of AI-RAN, we present a proof-of-concept that concurrently processes RAN and AI workloads utilizing NVIDIA Grace-Hopper GH200 servers. Finally, we conclude the article by outlining future work directions to guide further developments of AI-RAN.